Sentimetre Model 2 Long-Short BackTest
**Backtest:**
We use a long-short, equally weighted portfolio backtest for all our models. Other papers tend to build a portfolio from only the top 10 long predictions and the top 10 short predictions, but we prefer to include all predictions in our portfolio. We assume that we can buy at the market open and liquidate at the market close. We do not account for transaction costs or slippage because we did not have adequate resources.
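The daily return of such a portfolio can be sketched as follows. This is a minimal illustration under the assumptions stated above (entry at the open, exit at the close, no costs or slippage), not our actual backtest code; the function name `long_short_daily_return` and the `(signal, open, close)` tuple layout are hypothetical.

```python
from statistics import mean


def long_short_daily_return(trades):
    """Equal-weighted long-short return for one trading day.

    `trades` is a list of (signal, open_price, close_price) tuples, where
    signal is +1 for a long prediction and -1 for a short prediction.
    Every prediction gets the same weight; costs and slippage are ignored.
    """
    if not trades:
        return 0.0  # no predictions that day: stay flat
    return mean(
        signal * (close_price / open_price - 1.0)
        for signal, open_price, close_price in trades
    )


# Hypothetical prices for one day with three predictions.
day = [
    (+1, 100.0, 102.0),  # long, stock rises 2%: gains 2%
    (-1, 50.0, 49.0),    # short, stock falls 2%: gains 2%
    (+1, 20.0, 19.8),    # long, stock falls 1%: loses 1%
]
daily = long_short_daily_return(day)
print(round(daily, 4))
```

Because every prediction is included with the same weight, one large move in a single stock has less influence than it would in a top-10 portfolio.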
**Proof-of-Concept 1: Reuters dataset 1, 2017-2020**
The data is split into training data (2017-2018) and test data (2019-2020). The text is preprocessed with text normalization, stemming, lemmatization and stop-word removal.
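As a rough sketch of the split and the simplest preprocessing steps, the snippet below divides records chronologically and normalizes text. The stop-word list, the helper names and the sample headlines are all hypothetical, and stemming and lemmatization are left out because they need a dedicated NLP library.

```python
import re
from datetime import date

# Hypothetical, deliberately tiny stop-word list; a real one is much longer.
STOP_WORDS = {"a", "an", "the", "of", "to", "in", "and", "on"}


def normalize(text):
    """Lowercase, keep alphabetic tokens only, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def split_by_year(records, last_train_year=2018):
    """Chronological split: years up to `last_train_year` train, later years test."""
    train = [r for r in records if r[0].year <= last_train_year]
    test = [r for r in records if r[0].year > last_train_year]
    return train, test


# Made-up headlines, for illustration only.
records = [
    (date(2017, 3, 1), "Shares of the company rise on strong earnings"),
    (date(2018, 11, 5), "Regulator opens an investigation"),
    (date(2019, 6, 20), "Profit warning sends the stock lower"),
]
train, test = split_by_year(records)
print(len(train), len(test))
print(normalize(records[0][1]))
```

Splitting by date rather than at random keeps every test headline later than every training headline, so the model never sees news from the period it is evaluated on.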
*Model accuracy on the validation dataset:*

- NLTK VADER Sentiment Analyzer: N/A
- Linear Classifier: 53%
- Sentimetre Model 1: 53%
- Sentimetre Model 2: 57%
*Prediction accuracy on the test dataset:*

- NLTK VADER Sentiment Analyzer: 50%
- Linear Classifier: 52%
- Sentimetre Model 1: 51%
- Sentimetre Model 2: 55%
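The text does not spell out how accuracy is measured. One plausible reading, shown here purely as an assumption, is directional accuracy: the share of predictions whose sign matches the sign of the realized open-to-close return. The function name and the sample numbers are hypothetical.

```python
def directional_accuracy(predictions, realized_returns):
    """Share of predictions whose sign matches the realized return.

    `predictions` holds +1 (long) or -1 (short); `realized_returns` holds
    the matching open-to-close returns. Days with a zero return are skipped
    because they have no direction to match.
    """
    if len(predictions) != len(realized_returns):
        raise ValueError("inputs must have the same length")
    pairs = [(p, r) for p, r in zip(predictions, realized_returns) if r != 0]
    if not pairs:
        return float("nan")
    hits = sum(1 for p, r in pairs if (p > 0) == (r > 0))
    return hits / len(pairs)


# Made-up example: two of four calls get the direction right.
preds = [+1, -1, +1, -1]
rets = [0.010, 0.020, -0.005, -0.010]
accuracy = directional_accuracy(preds, rets)
print(accuracy)
```

Under this reading, a model that guesses at random would be expected to score about 50%.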
df_a.head(50)
| | Unnamed: 0 | Unnamed: 0.1 | Unnamed: 0.1.1 | Unnamed: 0.1.1.1 | Unnamed: 0.1.1.1.1 | Unnamed: 0.1.1.1.1.1 | Unnamed: 0.1.1.1.1.1.1 | Unnamed: 0.1.1.1.1.1.1.1 | Unnamed: 0.1.1.1.1.1.1.1.1 | level_0 | ... | returnpredvader | returnpredsgd | dailyaveragereturn | dailyaveragereturnvader | dailyaveragereturnsgd | cumreturn1b | cumreturn1d | cumreturn1e | cumreturndow | cumreturnsp500 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 23 | 23 | 23 | 23 | 23 | 23 | 23 | 23 | 23 | 23 | ... | -3.147169 | 3.147169 | 1.001532 | 1.014097 | 0.988675 | 1.001532 | 1.014097 | 0.988675 | 0.975243 | 0.971729 |
| 1 | 112 | 112 | 112 | 112 | 112 | 112 | 112 | 112 | 112 | 112 | ... | -3.348837 | 3.348837 | 1.020781 | 1.002606 | 1.010641 | 1.022345 | 1.016740 | 0.999195 | 1.008729 | 1.003723 |
| 2 | 153 | 153 | 153 | 153 | 153 | 153 | 153 | 153 | 153 | 153 | ... | -5.204246 | -5.204246 | 0.994020 | 0.995286 | 0.981924 | 1.016231 | 1.011947 | 0.981134 | 1.015801 | 1.007929 |
| 3 | 239 | 239 | 239 | 239 | 239 | 239 | 239 | 239 | 239 | 239 | ... | 0.929615 | -0.929615 | 0.999709 | 0.991541 | 1.002966 | 1.015935 | 1.003387 | 0.984044 | 1.025649 | 1.018899 |
| 4 | 284 | 284 | 284 | 284 | 284 | 284 | 284 | 284 | 284 | 284 | ... | -0.929615 | -0.929615 | 1.001278 | 0.990225 | 1.004563 | 1.017233 | 0.993579 | 0.988534 | 1.029852 | 1.022825 |
| 5 | 349 | 349 | 349 | 349 | 349 | 349 | 349 | 349 | 349 | 349 | ... | 12.653061 | -12.653061 | 1.009986 | 1.005615 | 0.990158 | 1.027392 | 0.999158 | 0.978805 | 1.034506 | 1.028085 |
| 6 | 403 | 403 | 403 | 403 | 403 | 403 | 403 | 403 | 403 | 403 | ... | -0.319614 | -0.319614 | 1.011374 | 1.008723 | 1.012217 | 1.039077 | 1.007874 | 0.990763 | 1.034354 | 1.027829 |
| 7 | 464 | 464 | 464 | 464 | 464 | 464 | 464 | 464 | 464 | 464 | ... | -0.690608 | 0.690608 | 1.003347 | 1.003949 | 1.004104 | 1.042555 | 1.011854 | 0.994829 | 1.028916 | 1.024141 |
| 8 | 532 | 532 | 532 | 532 | 532 | 532 | 532 | 532 | 532 | 532 | ... | 0.585652 | -0.585652 | 1.035546 | 1.052613 | 1.002859 | 1.079613 | 1.065090 | 0.997673 | 1.039948 | 1.030812 |
| 9 | 617 | 617 | 617 | 617 | 617 | 617 | 617 | 617 | 617 | 617 | ... | 2.448730 | 2.448730 | 1.008794 | 1.010841 | 1.011141 | 1.089107 | 1.076637 | 1.008787 | 1.042258 | 1.036876 |
| 10 | 690 | 690 | 690 | 690 | 690 | 690 | 690 | 690 | 690 | 690 | ... | -0.699153 | -0.699153 | 1.005840 | 1.010511 | 0.995994 | 1.095468 | 1.087953 | 1.004746 | 1.050171 | 1.043855 |
| 11 | 760 | 760 | 760 | 760 | 760 | 760 | 760 | 760 | 760 | 760 | ... | 3.989354 | -3.989354 | 0.993924 | 1.008535 | 0.988490 | 1.088812 | 1.097239 | 0.993181 | 1.064015 | 1.058258 |
| 12 | 832 | 832 | 832 | 832 | 832 | 832 | 832 | 832 | 832 | 832 | ... | -3.862661 | 3.862661 | 0.997719 | 0.990693 | 0.997142 | 1.086328 | 1.087027 | 0.990343 | 1.051262 | 1.052659 |
| 13 | 915 | 915 | 915 | 915 | 915 | 915 | 915 | 915 | 915 | 915 | ... | 1.791656 | 1.791656 | 1.003994 | 0.997071 | 0.998687 | 1.090667 | 1.083843 | 0.989042 | 1.052709 | 1.051700 |
| 14 | 992 | 992 | 992 | 992 | 992 | 992 | 992 | 992 | 992 | 992 | ... | 0.193986 | -0.193986 | 1.004965 | 0.976378 | 1.006797 | 1.096082 | 1.058241 | 0.995765 | 1.061645 | 1.059580 |
| 15 | 1052 | 1052 | 1052 | 1052 | 1052 | 1052 | 1052 | 1052 | 1052 | 1052 | ... | -2.180621 | 2.180621 | 1.009515 | 1.007491 | 0.990477 | 1.106511 | 1.066167 | 0.986283 | 1.053314 | 1.050628 |
| 16 | 1128 | 1128 | 1128 | 1128 | 1128 | 1128 | 1128 | 1128 | 1128 | 1128 | ... | -0.236616 | -0.236616 | 1.000961 | 1.022304 | 1.016057 | 1.107574 | 1.089947 | 1.002120 | 1.051780 | 1.052845 |
| 17 | 1223 | 1223 | 1223 | 1223 | 1223 | 1223 | 1223 | 1223 | 1223 | 1223 | ... | -0.527778 | 0.527778 | 0.999108 | 0.996710 | 0.988060 | 1.106587 | 1.086361 | 0.990155 | 1.068135 | 1.071473 |
| 18 | 1348 | 1348 | 1348 | 1348 | 1348 | 1348 | 1348 | 1348 | 1348 | 1348 | ... | -1.090757 | 1.090757 | 0.995765 | 0.993474 | 1.001761 | 1.101900 | 1.079272 | 0.991898 | 1.077318 | 1.070822 |
| 19 | 1480 | 1480 | 1480 | 1480 | 1480 | 1480 | 1480 | 1480 | 1480 | 1480 | ... | -1.308060 | 1.308060 | 1.019066 | 0.998929 | 0.995950 | 1.122909 | 1.078115 | 0.987881 | 1.078286 | 1.073573 |
| 20 | 1549 | 1549 | 1549 | 1549 | 1549 | 1549 | 1549 | 1549 | 1549 | 1549 | ... | -1.421801 | -1.421801 | 0.998408 | 0.999468 | 0.993334 | 1.121121 | 1.077541 | 0.981296 | 1.085593 | 1.081089 |
| 21 | 1614 | 1614 | 1614 | 1614 | 1614 | 1614 | 1614 | 1614 | 1614 | 1614 | ... | -0.600387 | 0.600387 | 0.996565 | 0.999695 | 1.003741 | 1.117270 | 1.077213 | 0.984967 | 1.090704 | 1.088463 |
| 22 | 1703 | 1703 | 1703 | 1703 | 1703 | 1703 | 1703 | 1703 | 1703 | 1703 | ... | 0.390259 | 0.390259 | 1.010500 | 1.006733 | 1.006129 | 1.129000 | 1.084466 | 0.991004 | 1.088278 | 1.087554 |
| 23 | 1798 | 1798 | 1798 | 1798 | 1798 | 1798 | 1798 | 1798 | 1798 | 1798 | ... | -0.101482 | 0.101482 | 1.006278 | 1.002703 | 1.006155 | 1.136088 | 1.087398 | 0.997104 | 1.078095 | 1.078098 |
| 24 | 1889 | 1889 | 1889 | 1889 | 1889 | 1889 | 1889 | 1889 | 1889 | 1889 | ... | 2.001191 | 2.001191 | 1.019696 | 1.006804 | 0.999160 | 1.158464 | 1.094796 | 0.996266 | 1.078824 | 1.075391 |
| 25 | 1959 | 1959 | 1959 | 1959 | 1959 | 1959 | 1959 | 1959 | 1959 | 1959 | ... | 3.975535 | 3.975535 | 1.024688 | 1.026296 | 0.986022 | 1.187064 | 1.123585 | 0.982340 | 1.079589 | 1.073111 |
| 26 | 2007 | 2007 | 2007 | 2007 | 2007 | 2007 | 2007 | 2007 | 2007 | 2007 | ... | -7.181572 | 7.181572 | 0.990992 | 0.988986 | 1.001652 | 1.176371 | 1.111210 | 0.983963 | 1.093505 | 1.089073 |
| 27 | 2069 | 2069 | 2069 | 2069 | 2069 | 2069 | 2069 | 2069 | 2069 | 2069 | ... | 0.216160 | 0.216160 | 0.992587 | 0.991376 | 0.987666 | 1.167650 | 1.101627 | 0.971828 | 1.096812 | 1.094106 |
| 28 | 2161 | 2161 | 2161 | 2161 | 2161 | 2161 | 2161 | 2161 | 2161 | 2161 | ... | -1.390728 | -1.390728 | 1.010441 | 1.001136 | 1.004938 | 1.179841 | 1.102878 | 0.976627 | 1.093903 | 1.089657 |
| 29 | 2240 | 2240 | 2240 | 2240 | 2240 | 2240 | 2240 | 2240 | 2240 | 2240 | ... | -0.767712 | 0.767712 | 1.011594 | 0.996391 | 1.009238 | 1.193520 | 1.098898 | 0.985649 | 1.105804 | 1.108669 |
| 30 | 2320 | 2320 | 2320 | 2320 | 2320 | 2320 | 2320 | 2320 | 2320 | 2320 | ... | -14.599483 | 14.599483 | 0.998742 | 0.999037 | 0.997596 | 1.192019 | 1.097839 | 0.983279 | 1.109429 | 1.111718 |
| 31 | 2393 | 2393 | 2393 | 2393 | 2393 | 2393 | 2393 | 2393 | 2393 | 2393 | ... | -1.509872 | -1.509872 | 0.992845 | 1.001974 | 1.002645 | 1.183490 | 1.100006 | 0.985880 | 1.105517 | 1.107272 |
| 32 | 2464 | 2464 | 2464 | 2464 | 2464 | 2464 | 2464 | 2464 | 2464 | 2464 | ... | 1.635323 | 1.635323 | 1.010289 | 1.003458 | 1.008498 | 1.195667 | 1.103810 | 0.994258 | 1.112604 | 1.115032 |
| 33 | 2527 | 2527 | 2527 | 2527 | 2527 | 2527 | 2527 | 2527 | 2527 | 2527 | ... | 0.116429 | 0.116429 | 0.997867 | 0.995095 | 0.998959 | 1.193116 | 1.098395 | 0.993223 | 1.113975 | 1.117608 |
| 34 | 2591 | 2591 | 2591 | 2591 | 2591 | 2591 | 2591 | 2591 | 2591 | 2591 | ... | 3.185596 | 3.185596 | 1.005740 | 1.003953 | 1.001200 | 1.199964 | 1.102737 | 0.994415 | 1.113094 | 1.116153 |
| 35 | 2692 | 2692 | 2692 | 2692 | 2692 | 2692 | 2692 | 2692 | 2692 | 2692 | ... | -0.125849 | -0.125849 | 1.007319 | 0.997495 | 1.009109 | 1.208747 | 1.099975 | 1.003473 | 1.112489 | 1.113034 |
| 36 | 2774 | 2774 | 2774 | 2774 | 2774 | 2774 | 2774 | 2774 | 2774 | 2774 | ... | -4.263094 | 4.263094 | 1.005833 | 0.995293 | 0.985575 | 1.215797 | 1.094798 | 0.988998 | 1.109345 | 1.110072 |
| 37 | 2886 | 2886 | 2886 | 2886 | 2886 | 2886 | 2886 | 2886 | 2886 | 2886 | ... | 0.342727 | -0.342727 | 1.018847 | 1.004337 | 1.023670 | 1.238711 | 1.099546 | 1.012407 | 1.116995 | 1.114797 |
| 38 | 2964 | 2964 | 2964 | 2964 | 2964 | 2964 | 2964 | 2964 | 2964 | 2964 | ... | -0.709939 | -0.709939 | 1.019083 | 1.010996 | 1.015796 | 1.262350 | 1.111637 | 1.028399 | 1.112660 | 1.105945 |
| 39 | 3013 | 3013 | 3013 | 3013 | 3013 | 3013 | 3013 | 3013 | 3013 | 3013 | ... | -0.451904 | -0.451904 | 0.995565 | 0.999405 | 0.999499 | 1.256751 | 1.110975 | 1.027885 | 1.111401 | 1.105387 |
| 40 | 3078 | 3078 | 3078 | 3078 | 3078 | 3078 | 3078 | 3078 | 3078 | 3078 | ... | 0.092807 | 0.092807 | 0.998999 | 0.995991 | 1.003584 | 1.255494 | 1.106521 | 1.031569 | 1.104150 | 1.099683 |
| 41 | 3140 | 3140 | 3140 | 3140 | 3140 | 3140 | 3140 | 3140 | 3140 | 3140 | ... | 10.352188 | -10.352188 | 0.996598 | 1.011136 | 1.000641 | 1.251223 | 1.118843 | 1.032230 | 1.095178 | 1.091106 |
| 42 | 3205 | 3205 | 3205 | 3205 | 3205 | 3205 | 3205 | 3205 | 3205 | 3205 | ... | 1.081211 | -1.081211 | 1.004553 | 1.005661 | 1.003162 | 1.256920 | 1.125177 | 1.035494 | 1.092844 | 1.090122 |
| 43 | 3252 | 3252 | 3252 | 3252 | 3252 | 3252 | 3252 | 3252 | 3252 | 3252 | ... | 0.551914 | -0.551914 | 0.996361 | 0.993332 | 0.999931 | 1.252347 | 1.117674 | 1.035423 | 1.108871 | 1.098716 |
| 44 | 3338 | 3338 | 3338 | 3338 | 3338 | 3338 | 3338 | 3338 | 3338 | 3338 | ... | -0.418035 | 0.418035 | 1.015368 | 0.995008 | 1.001636 | 1.271593 | 1.112094 | 1.037117 | 1.112146 | 1.094594 |
| 45 | 3421 | 3421 | 3421 | 3421 | 3421 | 3421 | 3421 | 3421 | 3421 | 3421 | ... | 6.149846 | -6.149846 | 1.001564 | 0.995438 | 0.989997 | 1.273582 | 1.107021 | 1.026743 | 1.119875 | 1.100943 |
| 46 | 3481 | 3481 | 3481 | 3481 | 3481 | 3481 | 3481 | 3481 | 3481 | 3481 | ... | 0.336409 | -0.336409 | 1.000199 | 0.998797 | 1.003925 | 1.273836 | 1.105689 | 1.030772 | 1.118903 | 1.101245 |
| 47 | 3590 | 3590 | 3590 | 3590 | 3590 | 3590 | 3590 | 3590 | 3590 | 3590 | ... | 1.845763 | 1.845763 | 0.999817 | 1.000833 | 1.001073 | 1.273603 | 1.106610 | 1.031879 | 1.124481 | 1.107196 |
| 48 | 3646 | 3646 | 3646 | 3646 | 3646 | 3646 | 3646 | 3646 | 3646 | 3646 | ... | -1.300822 | -1.300822 | 1.003085 | 0.997391 | 1.002216 | 1.277531 | 1.103723 | 1.034166 | 1.128648 | 1.109990 |
| 49 | 3728 | 3728 | 3728 | 3728 | 3728 | 3728 | 3728 | 3728 | 3728 | 3728 | ... | 1.770495 | 1.770495 | 1.003434 | 0.998701 | 0.993998 | 1.281918 | 1.102289 | 1.027958 | 1.128500 | 1.108846 |
50 rows × 78 columns
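In the rows shown, the cumulative column `cumreturn1b` appears to be the running product of the daily gross returns in `dailyaveragereturn`. The sketch below checks that reading against the first three rows of the table; it is an inference from the printed values, not the original notebook code.

```python
from itertools import accumulate
from operator import mul

# First three `dailyaveragereturn` values from the table (gross returns).
daily_gross = [1.001532, 1.020781, 0.994020]

# Running product: cumulative growth of one unit of capital.
cumulative = list(accumulate(daily_gross, mul))

# First three `cumreturn1b` values from the table, for comparison.
table_values = [1.001532, 1.022345, 1.016231]
for computed, shown in zip(cumulative, table_values):
    print(round(computed, 6), shown)
```

The computed values agree with the table to the six decimals displayed, which is consistent with compounding each day's portfolio return on the previous day's capital.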